Fixes some issues with reliability #301
juliuspfadt wants to merge 9 commits into jasp-stats:master from
Conversation
Codecov Report

❌ Patch coverage is

Additional details and impacted files

@@ Coverage Diff @@
##        master     #301   +/- ##
=========================================
  Coverage        ?   88.34%
=========================================
  Files           ?        6
  Lines           ?     3476
  Branches        ?        0
=========================================
  Hits            ?     3071
  Misses          ?      405
  Partials        ?        0

View full report in Codecov by Sentry.
Pull request overview
This pull request addresses issues #3803 and #2155 related to split-half reliability coefficient calculations in JASP. The key change is correcting how items are split into two halves - from an incorrect first-half/second-half split to the proper odd-numbered/even-numbered split. Additionally, it implements distinct formulas for standardized (Spearman-Brown) and unstandardized (Flanagan-Rulon) split-half coefficients, adds analytic confidence intervals via the multivariate delta method, and improves documentation.
Changes:
- Corrected split-half item division to use odd/even numbered items instead of first/second half
- Implemented Flanagan-Rulon coefficient (4·Cov(X₁, X₂) / Var(X)) for unstandardized split-half
- Implemented Spearman-Brown coefficient (2r / (1 + r)) for standardized split-half
- Added analytic standard error calculation for split-half coefficient and average inter-item correlation using multivariate delta method
- Updated documentation to clarify split-half formulas and add Flanagan reference
- Updated test expectations to reflect corrected calculations
- Added README badges for unit tests and code coverage
- Updated renv to version 1.1.7
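For reference, the two split-half formulas listed in the changes above can be sketched in a few lines of R. This is an illustrative sketch with made-up names, not the module's actual implementation, which also handles missing data and edge cases:

```r
# Illustrative sketch of the two split-half coefficients (not the PR's code).
# half1, half2: sum scores of the two test-halves (e.g., odd/even items).
splitHalfCoefs <- function(half1, half2) {
  r <- cor(half1, half2)  # correlation between the half-test sum scores
  list(
    # Spearman-Brown (standardized split-half): 2r / (1 + r)
    spearmanBrown = (2 * r) / (1 + r),
    # Flanagan-Rulon (unstandardized split-half): 4 * Cov(X1, X2) / Var(X),
    # where X = X1 + X2 is the total score
    flanaganRulon = 4 * cov(half1, half2) / var(half1 + half2)
  )
}
```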
Reviewed changes
Copilot reviewed 10 out of 14 changed files in this pull request and generated 11 comments.
| File | Description |
|---|---|
| R/unidimensionalReliabilityFrequentist.R | Core implementation of corrected split-half calculations with Flanagan-Rulon and Spearman-Brown formulas, added delta method SE calculations |
| R/unidimensionalReliabilityBayesian.R | Updated Bayesian split-half to use correct formulas and always use covariance matrices |
| tests/testthat/test-unidimensionalReliabilityFrequentist.R | Updated expected values for corrected split-half coefficient calculations |
| tests/testthat/test-unidimensionalReliabilityBayesian.R | Updated expected values and improved coefficient labels (McDonald's α, Cronbach's α) |
| tests/testthat/_snaps/unidimensionalReliabilityBayesian/*.svg | Updated plot snapshots reflecting corrected posterior distributions |
| inst/help/unidimensionalReliabilityFrequentist.md | Enhanced documentation explaining split-half formulas, added Flanagan reference |
| inst/help/unidimensionalReliabilityBayesian.md | Enhanced documentation explaining split-half formulas, added Flanagan reference |
| README.md | Added CI badges and maintainer information |
| renv/activate.R | Version bump from 1.1.5 to 1.1.7 with enhanced functionality |
| .renvignore | Added exclusion for jasp_dev_work_dir/ |
- Cronbach's alpha (for binary items the coefficient equals KR20)
- Guttman's lambda 2
- Split-half coefficient: Correlates the sum scores of two test-halves. By default the variables are split into odd and even numbered items in order of appearance in the variables window. If another split is desired the variables just need to be reordered.
- Split-half coefficient: Splits the items into two halves, by default odd and even numbered items in order of appearance in the variables window. If another split is desired the variables just need to be reordered. The unstandardized split-half coefficient is the Flanagan-Rulon coefficient (equivalent to the Guttman split-half), computed as 4·Cov(X₁, X₂) / Var(X). The standardized split-half coefficient is the Spearman-Brown coefficient, computed as 2r / (1 + r), where r is the correlation between the two half-test sum scores.
The formula for the Flanagan-Rulon coefficient in the documentation uses an interpunct character (·) which may not render correctly in all contexts. Consider using an asterisk (*) or explicit multiplication notation for better compatibility.
- Split-half coefficient: Splits the items into two halves, by default odd and even numbered items in order of appearance in the variables window. If another split is desired the variables just need to be reordered. The unstandardized split-half coefficient is the Flanagan-Rulon coefficient (equivalent to the Guttman split-half), computed as 4·Cov(X₁, X₂) / Var(X). The standardized split-half coefficient is the Spearman-Brown coefficient, computed as 2r / (1 + r), where r is the correlation between the two half-test sum scores.
- Split-half coefficient: Splits the items into two halves, by default odd and even numbered items in order of appearance in the variables window. If another split is desired the variables just need to be reordered. The unstandardized split-half coefficient is the Flanagan-Rulon coefficient (equivalent to the Guttman split-half), computed as 4 * Cov(X₁, X₂) / Var(X). The standardized split-half coefficient is the Spearman-Brown coefficient, computed as 2r / (1 + r), where r is the correlation between the two half-test sum scores.
if (standardized) {
  # Spearman-Brown: correlation then 2r/(1+r)
  r <- Cov_XA_XB / sqrt(Var_XA * Var_XB)
  rsh <- (2 * r) / (1 + r)
The Spearman-Brown calculation could encounter division by zero when (1 + r) is zero, i.e., when r = -1. This is theoretically possible when the two halves have perfect negative correlation. Consider adding a check for this edge case and returning NA or an appropriate value.
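A minimal sketch of such a guard (the helper name is hypothetical, not part of this PR):

```r
# Hypothetical guard for the r = -1 edge case in the Spearman-Brown formula:
# with perfectly negatively correlated halves, 2r / (1 + r) divides by zero.
spearmanBrownSafe <- function(r) {
  if (is.na(r) || abs(1 + r) < .Machine$double.eps)
    return(NA_real_)  # reliability undefined in this degenerate case
  (2 * r) / (1 + r)
}
```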
partSums2 <- rowSums(dtUse[, splits[[2]], drop = FALSE])

out[["se"]][["scaleSplithalf"]] <- .seSplithalf(partSums1, partSums2, model[["use.cases"]])
out[["se"]][["scaleSplithalf"]] <- .seSplithalf(dataset, splits, standardized = isStd, scaleThreshold = options[["hiddenScaleThreshold"]])
The split-half SE calculation passes the raw dataset to .seSplithalf without considering model[["use.cases"]] for missing-data handling, while the point estimate correctly uses model[["use.cases"]]. This inconsistency could lead to mismatched dimensions or an incorrect SE when there are missing values. Consider ensuring both use the same data or the same use-cases approach.
| out[["se"]][["scaleSplithalf"]] <- .seSplithalf(dataset, splits, standardized = isStd, scaleThreshold = options[["hiddenScaleThreshold"]]) | |
| out[["se"]][["scaleSplithalf"]] <- .seSplithalf(dtUse, splits, standardized = isStd, scaleThreshold = options[["hiddenScaleThreshold"]]) |
No, that's not correct. The split-half coefficient for the standardized case operates on the raw data with a different formula; if we used dtUse it would use the standardized data.
} else {
  # Flanagan-Rulon / Guttman split-half: 4 * Cov(X1, X2) / Var(X)
  totalScore <- partSums1 + partSums2
  rsh <- 4 * cov(partSums1, partSums2, use = useCase) / var(totalScore, na.rm = TRUE)
The Flanagan-Rulon calculation in the unstandardized case could be problematic when var(totalScore) is zero or near-zero, which can occur with constant data or extreme cases. Consider adding a check and returning NA or an appropriate error when variance is zero to prevent division by zero.
rsh <- 4 * cov(partSums1, partSums2, use = useCase) / var(totalScore, na.rm = TRUE)
varTotal <- var(totalScore, na.rm = TRUE)
if (is.na(varTotal) || varTotal <= .Machine$double.eps) {
  # Variance of total score is zero or effectively zero; reliability undefined
  rsh <- NA_real_
} else {
  rsh <- 4 * cov(partSums1, partSums2, use = useCase) / varTotal
}
if (standardized) {
  # Spearman-Brown coefficient: 2r/(1+r) on raw data correlation
  r <- cor(partSums1, partSums2, use = useCase)
  rsh <- (2 * r) / (1 + r)
The Spearman-Brown calculation could encounter division by zero when (1 + r) is zero, i.e., when r = -1. This is theoretically possible when the two halves have perfect negative correlation. Consider adding a check for this edge case and returning NA or an appropriate value.
r <- C_AB / sqrtVAVB
# dr/d(vecSigma)
dr <- mAB / sqrtVAVB - (r / (2 * V_A)) * mAA - (r / (2 * V_B)) * mBB
G <- matrix((2 / (1 + r)^2) * dr, nrow = 1)
In the standardized case, the gradient calculation could be problematic when (1 + r)^2 is zero or near-zero (i.e., when r approaches -1). This could lead to numerical instability or division by zero. Consider adding validation or numerical safeguards for this edge case.
G <- matrix((2 / (1 + r)^2) * dr, nrow = 1)
# Guard against (1 + r)^2 being zero or numerically too small
denom <- (1 + r)^2
if (denom < .Machine$double.eps) {
  denom <- .Machine$double.eps
}
G <- matrix((2 / denom) * dr, nrow = 1)
G <- matrix((2 / (1 + r)^2) * dr, nrow = 1)
} else {
  # Flanagan-Rulon: FR = 4 * C_AB / S
  # dFR/d(vecSigma) = 4*(a_AB * S - C_AB) / S^2
In the Flanagan-Rulon case, the gradient calculation involves division by S^2 (total variance squared). When S is zero or near-zero (constant data), this will cause numerical problems. Consider adding validation to check for this case and return NA for the standard error when appropriate.
# dFR/d(vecSigma) = 4*(a_AB * S - C_AB) / S^2
# Guard against S being zero or numerically near-zero, which would
# cause division by S^2 to be unstable; in that case, the SE is undefined.
if (abs(S) < sqrt(.Machine$double.eps)) {
  return(NA_real_)
}
for (a in seq_len(J)) {
  for (b in seq_len(J)) {
    if (a != b) {
      Gmat[a, b] <- 1 / (m * sqrt(dC[a] * dC[b]))
    } else {
      Gmat[a, a] <- -sum(R[a, -a]) / (m * dC[a])
    }
  }
}
The gradient calculation involves division by sqrt(dC[a] * dC[b]) for off-diagonal elements and by dC[a] for diagonal elements. When any variance (dC[a] or dC[b]) is zero or near-zero, this will cause numerical problems. Consider adding validation to check for zero or near-zero variances and handle them appropriately (e.g., return NA).
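One way to implement that validation, assuming `dC` holds the item variances as in the loop above (the helper name is hypothetical, not part of this PR):

```r
# Hypothetical pre-check before building the gradient matrix: the delta-method
# gradient divides by item variances, so flag zero or near-zero variances.
variancesUsable <- function(dC, tol = sqrt(.Machine$double.eps)) {
  all(is.finite(dC)) && all(dC > tol)
}
```

The caller could then return `NA_real_` for the standard error when the check fails.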
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@ejw would be very happy if this made it in time for the next release.
fixes jasp-stats/jasp-issues#3803
Try this file. Note that in JASP the variables are split into odd- and even-numbered items, not into a first half and a second half as in SPSS.
sandboxL4.zip
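The odd/even assignment described above can be sketched in R (illustrative only; item positions follow the order in the variables window):

```r
# Items are split by position: odd-numbered vs. even-numbered,
# not first half vs. second half (the SPSS convention).
J <- 7  # number of items (illustrative)
splits <- list(odd = seq(1, J, by = 2), even = seq(2, J, by = 2))
```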
fixes jasp-stats/jasp-issues#2155 (choose split.half and standardized and you get Spearman-Brown)
I am not sure why those two previous commits landed there... ah, probably because I never renamed the branch and continued working on it.